Goto

Collaborating Authors

 power consumption


LAPA: Log-Domain Prediction-Driven Dynamic Sparsity Accelerator for Transformer Model

Wang, Huizheng, Wang, Hongbin, Wei, Shaojun, Hu, Yang, Yin, Shouyi

arXiv.org Artificial Intelligence

Attention-based Transformers have revolutionized natural language processing (NLP) and shown strong performance in computer vision (CV) tasks. However, as the input sequence varies, the computational bottlenecks in Transformer models exhibit dynamic behavior across stages, which calls for a cross-stage sparse acceleration strategy. Unfortunately, most existing sparse Transformer approaches are single-stage based, and their sparsity prediction mechanisms lead to significant power overhead when applied across multiple stages. To this end, this paper proposes a log-domain attention prediction algorithm-architecture co-design, named LAPA. First, an asymmetric leading one computing (ALOC) scheme is designed to eliminate expensive multiplications. Next, a mixed-precision multi-round shifting accumulation (MRSA) mechanism is further proposed to mitigate the accumulation overhead. A data-feature dependent filter (DDF) strategy is designed to work in concert with the MRSA process. Finally, an elaborate accelerator is designed to translate the theoretical enhancement into practical hardware improvement. Experimental results show that LAPA achieves 3.52x, 3.24x and 2.79x higher energy efficiency than the state-of-the-art (SOTA) works Spatten, Sanger and FACT, respectively.


Cyclical Temporal Encoding and Hybrid Deep Ensembles for Multistep Energy Forecasting

Khazem, Salim, Kanso, Houssam

arXiv.org Artificial Intelligence

Accurate electricity consumption forecasting is essential for demand management and smart grid operations. This paper introduces a unified deep learning framework that integrates cyclical temporal encoding with hybrid LSTM-CNN architectures to enhance multistep energy forecasting. We systematically transform calendar-based attributes using sine cosine encodings to preserve periodic structure and evaluate their predictive relevance through correlation analysis. To exploit both long-term seasonal effects and short-term local patterns, we employ an ensemble model composed of an LSTM, a CNN, and a meta-learner of MLP regressors specialized for each forecast horizon. Using a one year national consumption dataset, we conduct an extensive experimental study including ablation analyses with and without cyclical encodings and calendar features and comparisons with established baselines from the literature. Results demonstrate consistent improvements across all seven forecast horizons, with our hybrid model achieving lower RMSE and MAE than individual architectures and prior methods. These findings confirm the benefit of combining cyclical temporal representations with complementary deep learning structures. To our knowledge, this is the first work to jointly evaluate temporal encodings, calendar-based features, and hybrid ensemble architectures within a unified short-term energy forecasting framework.


Machine Learning to Predict Slot Usage in TSCH Wireless Sensor Networks

Scanzio, Stefano, Formis, Gabriele, Facchinetti, Tullio, Cena, Gianluca

arXiv.org Artificial Intelligence

Wireless sensor networks (WSNs) are employed across a wide range of industrial applications where ultra-low power consumption is a critical prerequisite. At the same time, these systems must maintain a certain level of determinism to ensure reliable and predictable operation. In this view, time slotted channel hopping (TSCH) is a communication technology that meets both conditions, making it an attractive option for its usage in industrial WSNs. This work proposes the use of machine learning to learn the traffic pattern generated in networks based on the TSCH protocol, in order to turn nodes into a deep sleep state when no transmission is planned and thus to improve the energy efficiency of the WSN. The ability of machine learning models to make good predictions at different network levels in a typical tree network topology was analyzed in depth, showing how their capabilities degrade while approaching the root of the tree. The application of these models on simulated data based on an accurate modeling of wireless sensor nodes indicates that the investigated algorithms can be suitably used to further and substantially reduce the power consumption of a TSCH network.


BlinkBud: Detecting Hazards from Behind via Sampled Monocular 3D Detection on a Single Earbud

Li, Yunzhe, Yan, Jiajun, Wei, Yuzhou, Liu, Kechen, Zhao, Yize, Zhang, Chong, Zhu, Hongzi, Lu, Li, Chang, Shan, Guo, Minyi

arXiv.org Artificial Intelligence

Failing to be aware of speeding vehicles approaching from behind poses a huge threat to the road safety of pedestrians and cyclists. In this paper, we propose BlinkBud, which utilizes a single earbud and a paired phone to online detect hazardous objects approaching from behind of a user. The core idea is to accurately track visually identified objects utilizing a small number of sampled camera images taken from the earbud. To minimize the power consumption of the earbud and the phone while guaranteeing the best tracking accuracy, a novel 3D object tracking algorithm is devised, integrating both a Kalman filter based trajectory estimation scheme and an optimal image sampling strategy based on reinforcement learning. Moreover, the impact of constant user head movements on the tracking accuracy is significantly eliminated by leveraging the estimated pitch and yaw angles to correct the object depth estimation and align the camera coordinate system to the user's body coordinate system, respectively. We implement a prototype BlinkBud system and conduct extensive real-world experiments. Results show that BlinkBud is lightweight with ultra-low mean power consumptions of 29.8 mW and 702.6 mW on the earbud and smartphone, respectively, and can accurately detect hazards with a low average false positive ratio (FPR) and false negative ratio (FNR) of 4.90% and 1.47%, respectively.


Introducing AI-Driven IoT Energy Management Framework

Mruthyunjaya, Shivani, Dutta, Anandi, Islam, Kazi Sifatul

arXiv.org Artificial Intelligence

Power consumption has become a critical aspect of modern life due to the consistent reliance on technological advancements. Reducing power consumption or following power usage predictions can lead to lower monthly costs and improved electrical reliability. The proposal of a holistic framework to establish a foundation for IoT systems with a focus on contextual decision making, proactive adaptation, and scalable structure. A structured process for IoT systems with accuracy and interconnected development would support reducing power consumption and support grid stability. This study presents the feasibility of this proposal through the application of each aspect of the framework. This system would have long term forecasting, short term forecasting, anomaly detection, and consideration of qualitative data with any energy management decisions taken. Performance was evaluated on Power Consumption Time Series data to display the direct application of the framework.


Energy Efficient Sleep Mode Optimization in 5G mmWave Networks via Multi Agent Deep Reinforcement Learning

Masrur, Saad, Guvenc, Ismail, Perez, David Lopez

arXiv.org Artificial Intelligence

Dynamic sleep mode optimization (SMO) in millimeter-wave (mmWave) networks is essential for maximizing energy efficiency (EE) under stringent quality-of-service (QoS) constraints. However, existing optimization and reinforcement learning (RL)-based approaches rely on aggregated, static base station (BS) traffic models that fail to capture non-stationary traffic dynamics and suffer from prohibitively large state-action spaces, limiting their real-world deployment. To address these challenges, this paper proposes a Multi-Agent Deep Reinforcement Learning (MARL) framework employing a Double Deep Q-Network (DDQN), referred to as MARL-DDQN, for adaptive SMO in a 3D urban environment using a time-varying and community-based user equipment (UE) mobility model. Unlike conventional single-agent RL, the proposed MARL-DDQN enables scalable, distributed decision-making with minimal signaling overhead. A realistic BS power consumption model and beamforming are integrated to accurately quantify EE, while QoS is uniquely defined in terms of throughput. The proposed method adaptively learns SMO policies to maximize EE while mitigating inter-cell interference and ensuring throughput fairness. Extensive simulations demonstrate that MARL-DDQN consistently outperforms state-of-the-art SM strategies, including the All On, iterative QoS-aware load-based (IT-QoS-LB), MARL-DDPG, and MARL-PPO, achieving up to 0. 60 Mbit/Joule EE, 8. 5 Mbps 10th-percentile throughput, and satisfying QoS constraints 95 % of the time under dynamic network scenarios. I. Introduction The exponential growth in mobile data demand has necessitated increased spectrum availability and accelerated the expansion of cellular network infrastructure. To address the limitations of the sub-6 GHz spectrum, millimeter wave (mmWave) communications, operating within the 30-300 GHz band, have emerged as a key enabler in fifth-generation (5G) networks. With significantly larger bandwidth availability, mmWave technology presents a viable solution to spectrum scarcity challenges [1]. However, mmWave signals suffer from high propagation loss, atmospheric absorption, and susceptibility to blockages, which severely limit coverage and reliability. To address coverage and growing capacity demands, 5G networks rely on densification, deploying numerous low-power mmWave BSs with inter-site distances of a few hundred meters [1]. These BSs utilize large antenna arrays to enable beamforming and spatial multiplexing, often relying on hybrid analog-digital precoding to reduce hardware complexity [2]. However, the RF chain remains a major source of power consumption, particularly the Analog-to-digital converters (ADCs) and digital-to-analog converters (DACs), whose power scales with sampling rate. Due to the higher frequencies and wider bandwidths of mmWave systems, these components require significantly higher sampling rates than sub-6 GHz systems [3], resulting in substantial energy demands.


Power-Efficient Autonomous Mobile Robots

Liu, Liangkai, Shi, Weisong, Shin, Kang G.

arXiv.org Artificial Intelligence

This paper presents pNav, a novel power-management system that significantly enhances the power/energy-efficiency of Autonomous Mobile Robots (AMRs) by jointly optimizing their physical/mechanical and cyber subsystems. By profiling AMRs' power consumption, we identify three challenges in achieving CPS (cyber-physical system) power-efficiency that involve both cyber (C) and physical (P) subsystems: (1) variabilities of system power consumption breakdown, (2) environment-aware navigation locality, and (3) coordination of C and P subsystems. pNav takes a multi-faceted approach to achieve power-efficiency of AMRs. First, it integrates millisecond-level power consumption prediction for both C and P subsystems. Second, it includes novel real-time modeling and monitoring of spatial and temporal navigation localities for AMRs. Third, it supports dynamic coordination of AMR software (navigation, detection) and hardware (motors, DVFS driver) configurations. pNav is prototyped using the Robot Operating System (ROS) Navigation Stack, 2D LiDAR, and camera. Our in-depth evaluation with a real robot and Gazebo environments demonstrates a >96% accuracy in predicting power consumption and a 38.1% reduction in power consumption without compromising navigation accuracy and safety.


Computer Vision for Real-Time Monkeypox Diagnosis on Embedded Systems

Delgado-López, Jacob M., Morell-Rodriguez, Ricardo A., Rosario, Sebastián O. Espinosa-Del, Lugo-Beauchamp, Wilfredo E.

arXiv.org Artificial Intelligence

The rapid diagnosis of infectious diseases, such as monkeypox, is crucial for effective containment and treatment, particularly in resource-constrained environments. This study presents an AI-driven diagnostic tool developed for deployment on the NVIDIA Jetson Orin Nano, leveraging the pre-trained MobileNetV2 architecture for binary classification. The model was trained on the open-source Monkeypox Skin Lesion Dataset, achieving a 93.07% F1-Score, which reflects a well-balanced performance in precision and recall. To optimize the model, the TensorRT framework was used to accelerate inference for FP32 and to perform post-training quantization for FP16 and INT8 formats. TensorRT's mixed-precision capabilities enabled these optimizations, which reduced the model size, increased inference speed, and lowered power consumption by approximately a factor of two, all while maintaining the original accuracy. Power consumption analysis confirmed that the optimized models used significantly less energy during inference, reinforcing their suitability for deployment in resource-constrained environments. The system was deployed with a Wi-Fi Access Point (AP) hotspot and a web-based interface, enabling users to upload and analyze images directly through connected devices such as mobile phones. This setup ensures simple access and seamless connectivity, making the tool practical for real-world applications. These advancements position the diagnostic tool as an efficient, scalable, and energy-conscious solution to address diagnosis challenges in underserved regions, paving the way for broader adoption in low-resource healthcare settings.


PERTINENCE: Input-based Opportunistic Neural Network Dynamic Execution

Shende, Omkar, Ananthanarayanan, Gayathri, Traiola, Marcello

arXiv.org Artificial Intelligence

Deep neural networks (DNNs) have become ubiquitous thanks to their remarkable ability to model complex patterns across various domains such as computer vision, speech recognition, robotics, etc. While large DNN models are often more accurate than simpler, lightweight models, they are also resource- and energy-hungry. Hence, it is imperative to design methods to reduce reliance on such large models without significant degradation in output accuracy. The high computational cost of these models is often necessary only for a reduced set of challenging inputs, while lighter models can handle most simple ones. Thus, carefully combining properties of existing DNN models in a dynamic, input-based way opens opportunities to improve efficiency without impacting accuracy. In this work, we introduce PERTINENCE, a novel online method designed to analyze the complexity of input features and dynamically select the most suitable model from a pre-trained set to process a given input effectively. To achieve this, we employ a genetic algorithm to explore the training space of an ML-based input dispatcher, enabling convergence towards the Pareto front in the solution space that balances overall accuracy and computational efficiency. We showcase our approach on state-of-the-art Convolutional Neural Networks (CNNs) trained on the CIFAR-10 and CIFAR-100, as well as Vision Transformers (ViTs) trained on TinyImageNet dataset. We report results showing PERTINENCE's ability to provide alternative solutions to existing state-of-the-art models in terms of trade-offs between accuracy and number of operations. By opportunistically selecting among models trained for the same task, PERTINENCE achieves better or comparable accuracy with up to 36% fewer operations.


Efficient Learning-Based Control of a Legged Robot in Lunar Gravity

Arm, Philip, Fischer, Oliver, Church, Joseph, Fuhrer, Adrian, Kolvenbach, Hendrik, Hutter, Marco

arXiv.org Artificial Intelligence

Legged robots are promising candidates for exploring challenging areas on low-gravity bodies such as the Moon, Mars, or asteroids, thanks to their advanced mobility on unstructured terrain. However, as planetary robots' power and thermal budgets are highly restricted, these robots need energy-efficient control approaches that easily transfer to multiple gravity environments. In this work, we introduce a reinforcement learning-based control approach for legged robots with gravity-scaled power-optimized reward functions. We use our approach to develop and validate a locomotion controller and a base pose controller in gravity environments from lunar gravity (1.62 m/s2) to a hypothetical super-Earth (19.62 m/s2). Our approach successfully scales across these gravity levels for locomotion and base pose control with the gravity-scaled reward functions. The power-optimized locomotion controller reached a power consumption for locomotion of 23.4 W in Earth gravity on a 15.65 kg robot at 0.4 m/s, a 23 % improvement over the baseline policy. Additionally, we designed a constant-force spring offload system that allowed us to conduct real-world experiments on legged locomotion in lunar gravity. In lunar gravity, the power-optimized control policy reached 12.2 W, 36 % less than a baseline controller which is not optimized for power efficiency. Our method provides a scalable approach to developing power-efficient locomotion controllers for legged robots across multiple gravity levels.